52 research outputs found

    An Efficient Cell List Implementation for Monte Carlo Simulation on GPUs

    Maximizing the performance potential of the modern-day GPU architecture requires judicious utilization of available parallel resources. Although dramatic reductions in execution time can often be obtained through straightforward mappings, further performance improvements often require algorithmic redesigns that more closely exploit the target architecture. In this paper, we focus on efficient molecular simulations for the GPU and propose a novel cell list algorithm that better utilizes its parallel resources. Our goal is an efficient GPU implementation of large-scale Monte Carlo simulations for the grand canonical ensemble. This is a particularly challenging application because there is inherently less computation and parallelism than in similar applications with molecular dynamics. Consistent with the results of prior researchers, our simulation results show that traditional cell list implementations for Monte Carlo simulations of molecular systems offer effectively no performance improvement for small systems [5, 14], even when ported to the GPU. For larger systems, however, the cell list implementation offers significant gains in performance. Furthermore, our novel cell list approach results in better performance for all problem sizes when compared with other GPU implementations with or without cell lists. Comment: 30 pages
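    For readers unfamiliar with the data structure, the following is a minimal CPU sketch of a traditional cell list, not the novel GPU algorithm the abstract proposes. It assumes a cubic periodic box with coordinates in [0, box); the function names `build_cell_list` and `neighbors_within_cutoff` are hypothetical and chosen only for illustration.

```python
import itertools
from collections import defaultdict


def build_cell_list(positions, box, cutoff):
    """Bin particle indices into cubic cells with edge length >= cutoff."""
    n_cells = max(1, int(box // cutoff))  # cells per box edge
    cell_size = box / n_cells
    cells = defaultdict(list)
    for i, pos in enumerate(positions):
        key = tuple(int(c / cell_size) % n_cells for c in pos)
        cells[key].append(i)
    return cells, n_cells, cell_size


def neighbors_within_cutoff(i, positions, cells, n_cells, cell_size, box, cutoff):
    """Return sorted indices of particles closer than cutoff to particle i."""
    home = tuple(int(c / cell_size) % n_cells for c in positions[i])
    # Visit the home cell and its 26 periodic neighbours. A set removes
    # duplicate cells when the box holds fewer than three cells per edge.
    candidate_cells = {
        tuple((h + d) % n_cells for h, d in zip(home, offset))
        for offset in itertools.product((-1, 0, 1), repeat=3)
    }
    found = []
    for key in candidate_cells:
        for j in cells.get(key, ()):
            if j == i:
                continue
            dist2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                d = a - b
                d -= box * round(d / box)  # minimum-image convention
                dist2 += d * d
            if dist2 < cutoff * cutoff:
                found.append(j)
    return sorted(found)


positions = [(0.5, 0.5, 0.5), (1.0, 0.5, 0.5), (9.8, 0.5, 0.5), (5.0, 5.0, 5.0)]
cells, n_cells, cell_size = build_cell_list(positions, box=10.0, cutoff=2.5)
near = neighbors_within_cutoff(0, positions, cells, n_cells, cell_size, 10.0, 2.5)
```

    In a Monte Carlo trial move only the energy of the moved particle changes, so a lookup like this replaces a scan over all particles. For small systems the candidate cells cover most of the box, which is consistent with the abstract's remark that cell lists bring little benefit there.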

    Improved fault recovery for core based trees

    The demand for multicast communication in wide-area networks, such as the Internet, is increasing. Core based trees is one protocol that has been proposed to support scalable multicasting for sparse groups. When faults occur in the network nodes or links of the tree, the tree can become disconnected. In this paper, we propose an efficient protocol for recovering from faults in a core based tree. One of the key ideas is a technique for restructuring the disconnected subtree so that a loop-free path to the core can be found. The correctness of this protocol is also proved.

    The Impact of Output Selection Function Choice on the Performance of Adaptive Wormhole Routing

    Many adaptive routing algorithms have been proposed for wormhole-routed interconnection networks. Comparatively little work, however, has been done on determining how the output selection function (routing policy) affects the performance of an adaptive routing algorithm. In this paper, we present a detailed simulation study of various selection functions for a fully adaptive mesh routing algorithm. The simulation results show that the choice of selection function has a significant effect on the average message latency. Thus, a naive implementation of an adaptive routing algorithm may lead to poor performance. These selection functions are also compared with a theoretically optimal selection function [1]. We show that, although it is theoretically optimal, the optimal selection function does not deliver the best performance in practice. An explanation and interpretation of the results are provided.
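    To make the term concrete, the sketch below shows where a selection function sits in minimal adaptive routing on a 2D mesh: the routing algorithm supplies the productive output ports, and the selection function picks one of those that are currently free. The two policies shown (`select_x_first`, `select_max_remaining`) are hypothetical examples and are not necessarily among the functions the paper evaluates.

```python
def productive_ports(current, dest):
    """Output ports that move a message closer to dest (minimal routing)."""
    (cx, cy), (dx, dy) = current, dest
    ports = []
    if dx > cx:
        ports.append("+x")
    elif dx < cx:
        ports.append("-x")
    if dy > cy:
        ports.append("+y")
    elif dy < cy:
        ports.append("-y")
    return ports


def select_x_first(candidates, current, dest):
    """Hypothetical policy: take the first candidate (x is listed before y)."""
    return candidates[0]


def select_max_remaining(candidates, current, dest):
    """Hypothetical policy: prefer the dimension with the most hops left."""
    hops_x = abs(dest[0] - current[0])
    hops_y = abs(dest[1] - current[1])
    remaining = {"+x": hops_x, "-x": hops_x, "+y": hops_y, "-y": hops_y}
    return max(candidates, key=lambda port: remaining[port])


def route_step(current, dest, free_ports, select):
    """Pick an output port for one hop, or None if the message must wait."""
    candidates = [p for p in productive_ports(current, dest) if p in free_ports]
    if not candidates:
        return None
    return select(candidates, current, dest)


hop_a = route_step((0, 0), (1, 3), {"+x", "+y"}, select_x_first)
hop_b = route_step((0, 0), (1, 3), {"+x", "+y"}, select_max_remaining)
```

    With both ports free, the two policies send the same message in different directions, which is the kind of difference the abstract reports as affecting average message latency.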